DCTNet and PCANet for acoustic signal feature extraction
Authors
Abstract
We introduce the use of DCTNet, an efficient approximation of and alternative to PCANet, for acoustic signal classification. In PCANet, the eigenfunctions of the local sample covariance matrix (PCA) are used as filterbanks for convolution and feature extraction. When the eigenfunctions are well approximated by the Discrete Cosine Transform (DCT) functions, each layer of PCANet and DCTNet is essentially a time-frequency representation. We relate DCTNet to spectral feature representation methods, such as the short-time Fourier transform (STFT), the spectrogram, and linear frequency spectral coefficients (LFSC). Experimental results on whale vocalization data show that DCTNet improves classification rate, demonstrating DCTNet's applicability to signal processing problems such as underwater acoustics.
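The relationship described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_filterbank`, `pca_filterbank`, `filter_layer`), the window length, the patch-mean removal step, and the synthetic test signal are all assumptions made for illustration. The sketch builds one filterbank from DCT-II basis functions and one from the eigenvectors of the local sample covariance, then applies either to sliding patches of a 1-D signal, which yields a time-frequency-like array of shape (filters, frames).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dct_filterbank(window: int) -> np.ndarray:
    """Rows are orthonormal DCT-II basis functions of length `window`."""
    n = np.arange(window)
    k = n[:, None]
    basis = np.cos(np.pi * (n[None, :] + 0.5) * k / window)
    basis[0] *= np.sqrt(1.0 / window)
    basis[1:] *= np.sqrt(2.0 / window)
    return basis


def pca_filterbank(signal: np.ndarray, window: int) -> np.ndarray:
    """Rows are eigenvectors of the local sample covariance, largest eigenvalue first."""
    patches = sliding_window_view(signal, window)
    # Assumption: remove each patch's mean before estimating the covariance.
    patches = patches - patches.mean(axis=1, keepdims=True)
    cov = patches.T @ patches / patches.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T


def filter_layer(signal: np.ndarray, filters: np.ndarray) -> np.ndarray:
    """Apply every filter to every sliding patch: shape (num_filters, num_frames)."""
    patches = sliding_window_view(signal, filters.shape[1])
    return filters @ patches.T


# Synthetic stand-in for an acoustic recording (hypothetical data, not whale calls).
rng = np.random.default_rng(0)
t = np.arange(1024) / 1024.0
signal = np.sin(2 * np.pi * (40 * t + 60 * t**2)) + 0.1 * rng.standard_normal(t.size)

window = 32
dct_filters = dct_filterbank(window)
pca_filters = pca_filterbank(signal, window)

dct_layer = filter_layer(signal, dct_filters)  # short-time DCT of the signal
pca_layer = filter_layer(signal, pca_filters)

print(dct_layer.shape, pca_layer.shape)
```

Squaring the entries of `dct_layer` gives a spectrogram-like representation, which is the sense in which a DCT filterbank layer resembles short-time spectral features. A second layer could be sketched by applying `filter_layer` again to each row of the first layer's output.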
Similar resources
Comparative Analysis of Wavelet-based Feature Extraction for Intramuscular EMG Signal Decomposition
Background: Electromyographic (EMG) signal decomposition is the process by which an EMG signal is decomposed into its constituent motor unit potential trains (MUPTs). A major step in EMG decomposition is feature extraction in which each detected motor unit potential (MUP) is represented by a feature vector. As with any other pattern recognition system, feature extraction has a significant impac...
Modeling the Phonological Recognition of Persian Words
A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...
Stacked Approximated Regression Machine: A Simple Deep Learning Approach
This paper proposes the Stacked Approximated Regression Machine (SARM), a novel, simple yet powerful deep learning (DL) baseline. We start by discussing the relationship between regularized regression models and feed-forward networks, with emphasis on the non-negative sparse coding and convolutional sparse coding models. We demonstrate how these models are naturally converted into a unified fee...
PCANet-II: When PCANet Meets the Second Order Pooling
PCANet, as one noticeable shallow network, employs the histogram representation for feature pooling. However, there are three main problems about this kind of pooling method. First, the histogram-based pooling method binarizes the feature maps and leads to inevitable discriminative information loss. Second, it is difficult to effectively combine other visual cues into a compact representation, ...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-Based Variability Compensation Methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Journal title:
- CoRR
Volume: abs/1605.01755; Issue: -
Pages: -
Publication date: 2016